Computational Pathology


When normalization hallucinates: unseen risks in AI-powered whole slide image processing

Moens, Karel, Blaschko, Matthew B., Tuytelaars, Tinne, Diricx, Bart, De Vylder, Jonas, Yousif, Mustafa

arXiv.org Artificial Intelligence

Whole slide image (WSI) normalization remains a vital preprocessing step in computational pathology. Increasingly, this step is handled by deep learning models that learn to approximate data distributions from training examples. This often results in outputs that gravitate toward the average, potentially masking diagnostically important features. More critically, these models can introduce hallucinated content: artifacts that appear realistic but are not present in the original tissue, posing a serious threat to downstream analysis. Such hallucinations are nearly impossible to detect visually, and current evaluation practices often overlook them. In this work, we demonstrate that the risk of hallucination is real and underappreciated. While many methods perform adequately on public datasets, we observe a concerning frequency of hallucinations when the same models are retrained and evaluated on real-world clinical data. To address this, we propose a novel image comparison measure designed to automatically detect hallucinations in normalized outputs. Using this measure, we systematically evaluate several well-cited normalization methods retrained on real-world data, revealing significant inconsistencies and failures that conventional metrics fail to capture. Our findings underscore the need for more robust, interpretable normalization techniques and stricter validation protocols before clinical deployment.
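
The abstract does not specify the proposed comparison measure, so the sketch below is only an illustrative stand-in for the idea of automatic hallucination screening: it flags tiles whose grayscale structure, rather than color, changes under normalization, using SSIM from scikit-image. The tile size and threshold are hypothetical.

```python
# Illustrative stand-in, not the paper's measure: a stain normalizer should
# change color while preserving tissue structure, so tiles with low
# grayscale SSIM between input and output are suspicious.
import numpy as np
from skimage.color import rgb2gray
from skimage.metrics import structural_similarity


def flag_suspicious_tiles(original_rgb, normalized_rgb, tile=256, thresh=0.80):
    """Return (row, col, ssim) for tiles whose structure changed."""
    g_in, g_out = rgb2gray(original_rgb), rgb2gray(normalized_rgb)
    flagged = []
    h, w = g_in.shape
    for r in range(0, h - tile + 1, tile):
        for c in range(0, w - tile + 1, tile):
            s = structural_similarity(g_in[r:r + tile, c:c + tile],
                                      g_out[r:r + tile, c:c + tile],
                                      data_range=1.0)
            if s < thresh:  # structure diverged: possible hallucination
                flagged.append((r, c, s))
    return flagged
```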


Contrastive Integrated Gradients: A Feature Attribution-Based Method for Explaining Whole Slide Image Classification

Vu, Anh Mai, Vo, Tuan L., Bui, Ngoc Lam Quang, Binh, Nam Nguyen Le, Awasthi, Akash, Vo, Huy Quoc, Nguyen, Thanh-Huy, Han, Zhu, Mohan, Chandra, Van Nguyen, Hien

arXiv.org Artificial Intelligence

Interpretability is essential in Whole Slide Image (WSI) analysis for computational pathology, where understanding model predictions helps build trust in AI-assisted diagnostics. While Integrated Gradients (IG) and related attribution methods have shown promise, applying them directly to WSIs introduces challenges due to their high-resolution nature. These methods capture model decision patterns but may overlook class-discriminative signals that are crucial for distinguishing between tumor subtypes. In this work, we introduce Contrastive Integrated Gradients (CIG), a novel attribution method that enhances interpretability by computing contrastive gradients in logit space. First, CIG highlights class-discriminative regions by comparing feature importance relative to a reference class, offering sharper differentiation between tumor and non-tumor areas. Second, CIG satisfies the axioms of integrated attribution, ensuring consistency and theoretical soundness. Third, we propose two attribution quality metrics, MIL-AIC and MIL-SIC, which measure how predictive information and model confidence evolve with access to salient regions, particularly under weak supervision.
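
As a rough illustration of the contrastive idea (not the paper's implementation), the sketch below integrates gradients of a logit difference, target class minus reference class, along a straight-line path from a baseline; the MIL aggregation over WSI patches and the MIL-AIC/MIL-SIC metrics are not reproduced here.

```python
# Minimal contrastive-IG sketch: attribute f_target(x) - f_reference(x)
# with a Riemann-sum approximation of the path integral.
import torch


def contrastive_ig(model, x, baseline, target, reference, steps=50):
    # x, baseline: tensors of shape (1, ...); model returns class logits
    total = torch.zeros_like(x)
    for k in range(1, steps + 1):
        point = (baseline + (k / steps) * (x - baseline)).detach().requires_grad_(True)
        logits = model(point)
        contrast = logits[0, target] - logits[0, reference]  # logit-space contrast
        grad, = torch.autograd.grad(contrast, point)
        total += grad
    return (x - baseline) * total / steps
```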


Synergy vs. Noise: Performance-Guided Multimodal Fusion For Biochemical Recurrence-Free Survival in Prostate Cancer

Chang, Seth Alain, Amjad, Muhammad Mueez, Wahab, Noorul, Alzaid, Ethar, Rajpoot, Nasir, Shephard, Adam

arXiv.org Artificial Intelligence

Multimodal deep learning (MDL) has emerged as a transformative approach in computational pathology. By integrating complementary information from multiple data sources, MDL models have demonstrated superior predictive performance across diverse clinical tasks compared to unimodal models. However, the assumption that combining modalities inherently improves performance remains largely unexamined. We hypothesise that multimodal gains depend critically on the predictive quality of the individual modalities, and that integrating weak modalities may introduce noise rather than complementary information. We test this hypothesis on a prostate cancer dataset with histopathology, radiology, and clinical data to predict time to biochemical recurrence. Our results confirm that combining high-performing modalities yields superior performance compared to unimodal approaches. However, integrating a poor-performing modality with higher-performing ones degrades predictive accuracy. These findings demonstrate that multimodal benefit requires selective, performance-guided integration rather than indiscriminate modality combination, with implications for MDL design across computational pathology and medical imaging.
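
A minimal sketch of what performance-guided selection could look like, assuming hypothetical per-modality validation scores (e.g., a C-index for time to biochemical recurrence) and simple late fusion by concatenation; the paper's actual fusion architecture is not described in the abstract.

```python
# Hypothetical gate: fuse only modalities whose unimodal validation score
# clears a threshold, instead of concatenating everything indiscriminately.
import numpy as np


def select_modalities(unimodal_scores, min_score=0.60):
    return [m for m, s in unimodal_scores.items() if s >= min_score]


def fuse(features, keep):
    # late fusion by concatenation; a survival head would sit on top
    return np.concatenate([features[m] for m in keep], axis=-1)


scores = {"histology": 0.72, "radiology": 0.61, "clinical": 0.55}  # placeholders
keep = select_modalities(scores)  # -> ['histology', 'radiology']
```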


EvoPS: Evolutionary Patch Selection for Whole Slide Image Analysis in Computational Pathology

Hashemian, Saya, Bidgoli, Azam Asilian

arXiv.org Artificial Intelligence

In computational pathology, the gigapixel scale of Whole-Slide Images (WSIs) necessitates their division into thousands of smaller patches. Analyzing these high-dimensional patch embeddings is computationally expensive and risks diluting key diagnostic signals with many uninformative patches. Existing patch selection methods often rely on random sampling or simple clustering heuristics and typically fail to explicitly manage the crucial trade-off between the number of selected patches and the accuracy of the resulting slide representation. To address this gap, we propose EvoPS (Evolutionary Patch Selection), a novel framework that formulates patch selection as a multi-objective optimization problem and leverages an evolutionary search to simultaneously minimize the number of selected patch embeddings and maximize the performance of a downstream similarity search task, generating a Pareto front of optimal trade-off solutions. We validated our framework across four major cancer cohorts from The Cancer Genome Atlas (TCGA), using five pretrained deep learning models, including both supervised CNNs and large self-supervised foundation models, to generate patch embeddings. The results demonstrate that EvoPS can reduce the required number of training patch embeddings by over 90% while consistently maintaining or even improving the final classification F1-score compared to a baseline that uses the embeddings of all available patches from a standard extraction pipeline. The EvoPS framework provides a robust and principled method for creating efficient, accurate, and interpretable WSI representations, empowering users to select an optimal balance between computational cost and diagnostic performance.
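
The toy sketch below illustrates the general recipe, evolving binary patch masks under two objectives (subset size versus a user-supplied downstream score) and keeping the non-dominated set; EvoPS's actual operators, similarity-search fitness, and scale are described in the paper, and score_fn here is a placeholder.

```python
# Toy bi-objective evolutionary patch selection: minimize mask size,
# maximize a downstream score, retain the Pareto front each generation.
import numpy as np

rng = np.random.default_rng(0)


def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and a != b


def evolve_masks(embeddings, score_fn, pop=30, gens=50, flip_p=0.02):
    n = len(embeddings)
    masks = rng.random((pop, n)) < 0.5                       # random initial subsets
    for _ in range(gens):
        children = masks ^ (rng.random(masks.shape) < flip_p)  # bit-flip mutation
        union = np.vstack([masks, children])
        objs = [(int(m.sum()),
                 -score_fn(embeddings[m]) if m.any() else np.inf)
                for m in union]
        front = [i for i, a in enumerate(objs)               # non-dominated set
                 if not any(dominates(objs[j], a)
                            for j in range(len(objs)) if j != i)]
        masks = union[(front * (pop // len(front) + 1))[:pop]]
    return masks  # approximate Pareto set of patch subsets
```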


Comparing Computational Pathology Foundation Models using Representational Similarity Analysis

Mishra, Vaibhav, Lotter, William

arXiv.org Artificial Intelligence

Foundation models are increasingly developed in computational pathology (CPath) given their promise in facilitating many downstream tasks. While recent studies have evaluated task performance across models, less is known about the structure and variability of their learned representations. Here, we systematically analyze the representational spaces of six CPath foundation models using techniques popularized in computational neuroscience. The models analyzed span vision-language contrastive learning (CONCH, PLIP, KEEP) and self-distillation (UNI (v2), Virchow (v2), Prov-GigaPath) approaches. Through representational similarity analysis using H&E image patches from TCGA, we find that UNI2 and Virchow2 have the most distinct representational structures, whereas Prov-GigaPath has the highest average similarity across models. Having the same training paradigm (vision-only vs. vision-language) did not guarantee higher representational similarity. The representations of all models showed a high slide-dependence, but relatively low disease-dependence. Stain normalization decreased slide-dependence for all models by a range of 5.5% (CONCH) to 20.5% (PLIP). In terms of intrinsic dimensionality, vision-language models demonstrated relatively compact representations, compared to the more distributed representations of vision-only models. These findings highlight opportunities to improve robustness to slide-specific features, inform model ensembling strategies, and provide insights into how training paradigms shape model representations. Our framework is extendable across medical imaging domains, where probing the internal representations of foundation models can support their effective development and deployment.
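
For readers unfamiliar with the technique, the sketch below shows one standard form of representational similarity analysis: build a representational dissimilarity matrix (RDM) per model over the same image patches, then correlate the RDMs; the paper's exact analysis choices may differ.

```python
# Second-order similarity between two models' embeddings of the same
# patches: correlate their patch-by-patch dissimilarity structures.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr


def rdm(features):
    # features: (n_patches, dim) -> condensed correlation-distance RDM
    return pdist(features, metric="correlation")


def rsa_similarity(feats_a, feats_b):
    # patch order must match across the two models
    rho, _ = spearmanr(rdm(feats_a), rdm(feats_b))
    return rho
```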


MSDM: Generating Task-Specific Pathology Images with a Multimodal Conditioned Diffusion Model for Cell and Nuclei Segmentation

Winter, Dominik, Bui, Mai, Gavaldon, Monica Azqueta, Triltsch, Nicolas, Rosati, Marco, Brieu, Nicolas

arXiv.org Artificial Intelligence

Scarcity of annotated data, particularly for rare or atypical morphologies, presents significant challenges for cell and nuclei segmentation in computational pathology. While manual annotation is labor-intensive and costly, synthetic data offers a cost-effective alternative. We introduce a Multimodal Semantic Diffusion Model (MSDM) for generating realistic, pixel-precise image-mask pairs for cell and nuclei segmentation. By conditioning the generative process on cellular/nuclear morphologies (using horizontal and vertical maps), RGB color characteristics, and BERT-encoded assay/indication metadata, MSDM generates datasets with desired morphological properties. These heterogeneous modalities are integrated via multi-head cross-attention, enabling fine-grained control over the generated images. Quantitative analysis demonstrates that synthetic images closely match real data, with low Wasserstein distances between embeddings of generated and real images under matching biological conditions. Incorporating these synthetic samples, exemplified here by columnar cells, significantly improves segmentation model accuracy on that cell type. This strategy systematically enriches datasets, directly targeting model deficiencies. We highlight the effectiveness of multimodal diffusion-based augmentation for advancing the robustness and generalizability of cell and nuclei segmentation models, paving the way for broader application of generative models in computational pathology.
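
The snippet below sketches the named conditioning mechanism in generic PyTorch, with image feature tokens attending over a sequence of heterogeneous condition tokens; the dimensions and the surrounding diffusion U-Net are hypothetical, not MSDM's actual architecture.

```python
# Generic multi-head cross-attention conditioning: queries come from image
# features, keys/values from concatenated condition embeddings
# (morphology maps, color statistics, metadata text).
import torch
import torch.nn as nn


class CrossAttnCondition(nn.Module):
    def __init__(self, img_dim=256, cond_dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(img_dim, heads, kdim=cond_dim,
                                          vdim=cond_dim, batch_first=True)

    def forward(self, img_tokens, cond_tokens):
        # img_tokens: (B, N_img, img_dim); cond_tokens: (B, N_cond, cond_dim)
        out, _ = self.attn(img_tokens, cond_tokens, cond_tokens)
        return img_tokens + out  # residual conditioning


x = torch.randn(2, 64, 256)   # hypothetical feature-map tokens
c = torch.randn(2, 3, 256)    # e.g., morphology, color, metadata tokens
y = CrossAttnCondition()(x, c)
```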


GAS-MIL: Group-Aggregative Selection Multi-Instance Learning for Ensemble of Foundation Models in Digital Pathology Image Analysis

Quan, Peiran, Gu, Zifan, Zhao, Zhuo, Zhou, Qin, Yang, Donghan M., Rong, Ruichen, Xie, Yang, Xiao, Guanghua

arXiv.org Artificial Intelligence

Foundation models (FMs) have transformed computational pathology by providing powerful, general-purpose feature extractors. However, adapting and benchmarking individual FMs for specific diagnostic tasks is often time-consuming and resource-intensive, especially given their scale and diversity. To address this challenge, we introduce Group-Aggregative Selection Multi-Instance Learning (GAS-MIL), a flexible ensemble framework that seamlessly integrates features from multiple FMs, preserving their complementary strengths without requiring manual feature selection or extensive task-specific fine-tuning. Across classification tasks in three cancer datasets -- prostate (PANDA), ovarian (UBC-OCEAN), and breast (TCGA-BrCa) -- GAS-MIL consistently achieves superior or on-par performance relative to individual FMs and established MIL methods, demonstrating its robustness and generalizability. By enabling efficient integration of heterogeneous FMs, GAS-MIL streamlines model deployment for pathology and provides a scalable foundation for future multimodal and precision oncology applications.
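
As a hedged illustration of the ensemble idea (the precise group-aggregative selection rule is defined in the paper), the sketch below pools each foundation model's patch features with an attention-based MIL head and mixes the resulting group vectors with learned softmax weights.

```python
# Each FM's patch features form a "group": attention-pool each group into a
# slide vector, then aggregate the groups with learned softmax weights.
import torch
import torch.nn as nn


class AttnPool(nn.Module):
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                   nn.Linear(hidden, 1))

    def forward(self, x):                    # x: (n_patches, dim)
        w = torch.softmax(self.score(x), dim=0)
        return (w * x).sum(dim=0)            # slide-level vector


class GroupEnsembleMIL(nn.Module):
    def __init__(self, fm_dims, n_classes, common=256):
        super().__init__()
        self.proj = nn.ModuleList(nn.Linear(d, common) for d in fm_dims)
        self.pool = nn.ModuleList(AttnPool(common) for _ in fm_dims)
        self.group_logits = nn.Parameter(torch.zeros(len(fm_dims)))
        self.head = nn.Linear(common, n_classes)

    def forward(self, bags):                 # one (n_i, fm_dims[i]) bag per FM
        slides = [pool(proj(b)) for b, proj, pool
                  in zip(bags, self.proj, self.pool)]
        g = torch.softmax(self.group_logits, dim=0)
        fused = sum(w * s for w, s in zip(g, slides))
        return self.head(fused)
```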